heliaCORE

Optimized AI kernels for Ambiq Silicon

Kernel acceleration for Ambiq AI.

ATfE + MVE ready

heliaCORE is Ambiq's optimized neural-network kernel library. This repository delivers it as ns-cmsis-nn: a CMSIS-NN-based kernel layer with Ambiq-tuned operators, CMSIS-Pack delivery, and HELIA integration paths for Apollo-class Cortex-M devices.

HELIA CORE V7.25
200+ accelerated ops
40+ Field models
53 Op types
4 Paths
find_package(ns-cmsis-nn REQUIRED CONFIG)
target_link_libraries(app PRIVATE nsx::cmsis_nn)
CMSIS-Pack, Zephyr, CMake, neuralSPOT-X, ATfE, GCC, Cortex-M0+, Cortex-M4, Cortex-M55, Apollo510

What it is

A kernel layer for HELIA AI workloads

heliaCORE is Ambiq’s optimized neural-network kernel layer for Apollo-class devices. In this repository, it is delivered as ns-cmsis-nn: inherited CMSIS-NN-compatible APIs where they apply, Ambiq-tuned kernels where HELIA needs more coverage, and packaging for runtimes, compilers, and firmware builds.

Ambiq workload coverage

Coverage for real Ambiq model graphs

Arm CMSIS-NN provides the trusted foundation for efficient neural-network kernels on Cortex-M. In Ambiq’s HELIA workflows, internal profiling across field-like models also highlighted important Ambiq-specific coverage needs: real graphs spend measurable time in padding, activations, reductions, and other operators around the largest MAC-heavy layers.

heliaCORE broadens Cortex-M accelerator coverage around the CMSIS-NN-compatible foundation so those end-to-end paths stay optimized on Ambiq silicon.

MVE: MVE-first where available

Cortex-M55 paths are a primary optimization target, with vectorized kernels where MVE can improve end-to-end latency.

DSP: DSP coverage for Apollo-class MCUs

Cortex-M DSP paths remain important on Apollo targets without MVE, so DSP variants stay part of the release surface.

Flow: Glue operators count too

Coverage extends beyond obvious MAC-heavy layers to the graph operators that shape real HELIA deployments.
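
The idea of shipping a vectorized path alongside a fallback for a small graph operator can be illustrated with an int8 ReLU, one of the activation-style operators mentioned above. This is a minimal sketch, not heliaCORE's actual kernel: the names sketch_relu_s8 and sketch_relu_s8_path are hypothetical, the path is chosen at compile time from the ACLE macro __ARM_FEATURE_MVE, and a DSP-specific variant is left out, so builds without MVE fall through to plain scalar C.

```c
#include <stddef.h>
#include <stdint.h>

/* ACLE sets bit 0 of __ARM_FEATURE_MVE when integer MVE is available. */
#if defined(__ARM_FEATURE_MVE) && (__ARM_FEATURE_MVE & 1)
#include <arm_mve.h>
#define SKETCH_HAVE_MVE 1
#else
#define SKETCH_HAVE_MVE 0
#endif

/* Hypothetical helper: reports which path this file was compiled with. */
static const char *sketch_relu_s8_path(void)
{
    return SKETCH_HAVE_MVE ? "mve" : "scalar";
}

/* Hypothetical in-place int8 ReLU: negative values become zero. */
static void sketch_relu_s8(int8_t *data, size_t size)
{
#if SKETCH_HAVE_MVE
    const int8x16_t zero = vdupq_n_s8(0);
    size_t remaining = size;
    while (remaining > 0) {
        /* Tail predication: only the first `remaining` lanes are active. */
        mve_pred16_t p = vctp8q((uint32_t)remaining);
        int8x16_t v = vldrbq_z_s8(data, p);   /* inactive lanes load as 0 */
        v = vmaxq_s8(v, zero);                /* max(x, 0) per lane */
        vstrbq_p_s8(data, v, p);              /* store active lanes only */
        size_t step = remaining < 16 ? remaining : 16;
        data += step;
        remaining -= step;
    }
#else
    /* Portable fallback used on cores without MVE (and on host builds). */
    for (size_t i = 0; i < size; ++i) {
        if (data[i] < 0) {
            data[i] = 0;
        }
    }
#endif
}
```

Calling sketch_relu_s8(buffer, length) clamps the buffer in place on either path, so the caller does not change between targets. A production library would typically add a DSP variant between the two branches and validate each path against the scalar reference.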

Ambiq field-like suite: 40+ models, 53 operator types, 247 unique operators, and 963 operator instances.
MLPerf Tiny baseline: 5 models, 7 operator types, 34 unique operators, and 80 operator instances.
Expanded coverage: 200+ accelerated operators for Ambiq silicon, plus additional variants for existing operators.


Pick your integration

Four supported ways in

HELIA family

Powering heliaRT and heliaAOT

heliaCORE is the shared kernel layer behind Ambiq’s edge-AI runtime and compiler flows. Both heliaRT and heliaAOT rely on optimized HELIA kernels where applicable.

Arm CMSIS ecosystem

heliaCORE is built on and distributed for the Arm CMSIS ecosystem, including CMSIS-NN-compatible APIs and CMSIS-Pack delivery. Ambiq-specific additions are intended to ease integration into the HELIA AI platform for Ambiq silicon. For vendor-neutral Cortex-M kernel work, Arm CMSIS-NN remains the upstream ecosystem reference.